A Generalized Suffix Tree and its (Un)expected Asymptotic Behaviors
Author
Abstract
Suffix trees find several applications in computer science and telecommunications, most notably in algorithms on strings, data compressions and codes. Despite this, very little is known about their typical behaviors. In a probabilistic framework, we consider a family of suffix trees (further called b-suffix trees) built from the first n suffixes of a random word. In this family a noncompact suffix tree (i.e., such that every edge is labeled by a single symbol) is represented by b = 1, and a compact suffix tree (i.e., without unary nodes) is asymptotically equivalent to b → ∞ as n → ∞. We study several parameters of b-suffix trees, namely: the depth of a given suffix, the depth of insertion, the height and the shortest feasible path. Some new results concerning typical (i.e., almost sure) behaviors of these parameters are established. These findings are used to obtain several insights into certain algorithms on words, molecular biology and universal data compression schemes.
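As a rough illustration of the parameters named in the abstract, the following Python sketch covers only the noncompact case (b = 1). It relies on an assumption that this page does not state: that the depth of a suffix in such a tree equals one plus the length of the longest prefix it shares with any other stored suffix. The function names (`random_word`, `lcp`, `suffix_depths`, `insertion_depth`) are hypothetical, and the word is assumed to be long enough that no comparison runs off its end.

```python
import random


def random_word(length, alphabet="ab", seed=0):
    """Draw a word of i.i.d. uniform symbols (an assumed memoryless source)."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))


def lcp(word, i, j):
    """Length of the longest common prefix of word[i:] and word[j:]."""
    k = 0
    while i + k < len(word) and j + k < len(word) and word[i + k] == word[j + k]:
        k += 1
    return k


def suffix_depths(word, n):
    """Depths of the first n suffixes (n >= 2) in a noncompact suffix tree.

    Assumption: depth = 1 + longest prefix shared with another stored suffix.
    """
    return [
        1 + max(lcp(word, i, j) for j in range(n) if j != i)
        for i in range(n)
    ]


def insertion_depth(word, n):
    """Depth at which suffix number n (0-based) lands in the tree of suffixes 0..n-1."""
    return 1 + max(lcp(word, n, j) for j in range(n))


if __name__ == "__main__":
    word = random_word(200)
    depths = suffix_depths(word, 50)
    print("height:", max(depths))
    print("depth of insertion:", insertion_depth(word, 50))
```

The height is taken here as the largest of the suffix depths. The quadratic pairwise comparison is chosen for readability, not efficiency, and the sketch does not model b-suffix trees for b > 1 or the shortest feasible path.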
Similar resources
Suffix Trees Revisited: (Un)Expected Asymptotic Behaviors
Suffix trees find several applications in computer sciences and telecommunications, most notably in algorithms on strings, data compressions and codes. Despite this, very little is known about their typical behavior. We consider in a probabilistic framework a family of suffix trees further called b-suffix trees built from the first n suffixes of a random word. In this family a noncompact suffix...
Suffix Trees and Simple Sources
Using an intricate method, Jacquet and Szpankowski [2] compared the depth of insertion into suffix-trees and tries in the non-uniform Bernoulli model, as well as the average size of suffix-trees and tries under the same model. They proved that the depth of insertion has asymptotically the same probabilistic behaviour in both cases, and that the average sizes of a trie and a suffix-tree built wi...
Suffix Tree of Alignment: An Efficient Index for Similar Data
We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings A and B is a compacted trie representing all suffixes in A and B. It has |A|+ |B| leaves and can be constructed in O(|A|+ |B|) time. However, if the two strings are similar, the generalized suffix tree is not efficient because it does not ex...
Space-efficient K-MER algorithm for generalized suffix tree
Suffix trees have emerged to be very fast for pattern searching yielding O(m) time, where m is the pattern size. Unfortunately their high memory requirements make it impractical to work with huge amounts of data. We present a memory efficient algorithm of a generalized suffix tree which reduces the space size by a factor of 10 when the size of the pattern is known beforehand. Experiments on th...
Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth
Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...
Journal title:
- SIAM J. Comput.
Volume 22, Issue
Pages -
Publication date 1993